Three discretization methods for rule induction
Authors
Abstract
We discuss problems associated with the induction of decision rules from data with numerical attributes. Real-life data frequently contain numerical attributes. Rule induction from numerical data requires an additional step called discretization, in which numerical values are converted into intervals. Most existing discretization methods are applied before rule induction, as part of data preprocessing. Some methods discretize numerical attributes while learning decision rules. We compare the classification accuracy of a discretization method based on conditional entropy, applied before rule induction, with two newly proposed methods, incorporated directly into the rule induction algorithm LEM2, where discretization and rule induction are performed at the same time. In all three approaches the same system is used for classification of new, unseen data. We conclude that the error rates of the three methods do not differ significantly; however, rules induced by the two new methods are simpler and stronger. © 2001 John Wiley & Sons, Inc.
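To make the idea of entropy-based discretization concrete, the following is a minimal sketch of how a single cut point could be chosen for one numerical attribute by minimizing the conditional entropy of the class given the side of the cut. It is not the algorithm evaluated in the paper, whose details the abstract does not give. The function names `entropy` and `best_cut_point` are hypothetical, and the toy data are invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_cut_point(values, labels):
    """Return (cut, conditional_entropy) for the best binary split.

    Returns None when the attribute has fewer than two distinct values.
    Hypothetical helper, written only to illustrate the general idea.
    """
    pairs = sorted(zip(values, labels))
    distinct = sorted({v for v, _ in pairs})
    best = None
    # Candidate cut points are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        left = [y for v, y in pairs if v < cut]
        right = [y for v, y in pairs if v >= cut]
        # Conditional entropy of the class given which interval a case falls in.
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if best is None or h < best[1]:
            best = (cut, h)
    return best


# Toy data (invented): one numerical attribute and a two-valued decision.
temperature = [36.6, 36.9, 37.2, 38.5, 39.1, 40.0]
flu = ["no", "no", "no", "yes", "yes", "yes"]
cut, h = best_cut_point(temperature, flu)
```

On this toy data the chosen cut falls between 37.2 and 38.5, where both resulting intervals are pure. A full preprocessing discretizer would typically apply such a step repeatedly and across all attributes, with a stopping criterion that this sketch does not attempt to model.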
Similar resources
Experimental Evaluation of Discretization Schemes for Rule Induction
This paper proposes an experimental evaluation of various discretization schemes in three different evolutionary systems for inductive concept learning. The various discretization methods are used in order to obtain a number of discretization intervals, which represent the basis for the methods adopted by the systems for dealing with numerical values. Basically, for each rule and attribute, one...
A Comparison of Three Strategies to Rule Induction from Data with Numerical Attributes
Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. The MLEM2 algorithm is an extension of the existing LEM2 rule induction algorithm. The LEM2 algorithm works correctly only for symbolic attributes and is a part of the LERS data...
A Tuning Aid for Discretization in Rule Induction
This paper examines where a tuning aid can be useful to help discretization of numerical attributes in rule induction, and subsequently improve deduction of induction results. Different discretization methods use different strategies to set up the borders for continuous attributes. They mostly incorporate class supervision to define the discretization borders. The tuning aid we present uses an unsu...
Reduct Calculation and Discretization of Numeric Attributes in Sparse Decision Systems
In this paper we discuss three problems in Data Mining Sparse Decision Systems: the problem of short reduct calculation, discretization of numerical attributes and rule induction. We present algorithms that provide approximate solutions to these problems and analyze the complexity of these algorithms.
Three Strategies to Rule Induction from Data with Numerical Attributes
Rule induction from data with numerical attributes must be accompanied by discretization. Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. The MLEM2 algorithm is an extension of the existing LEM2 rule induction algorithm, work...
Journal:
- Int. J. Intell. Syst.
Volume 16, Issue
Pages -
Publication date: 2001